Word Spotting in Scanned Tamil Land Documents using K-Nearest Neighbor
نویسندگان
چکیده
word spotting is a technique which can extract the text from input image. Here, we implemented on scanned Tamil land documents. Using Gabor feature, we extract the feature values for the input image. The main goal is recognize the text from the document using K nearest neighbor classifier. The features were calculated and the features were combined. Using these features, we can classify and recognized a text using an unsupervised classifier. The Classifier needs training of a database. The database consists of some set of words (printed or scanned). For the classification of the feature vector the k nearest neighbor Classifiers were employed.
منابع مشابه
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...
متن کاملAn Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...
متن کاملA Survey on Various Word Spotting Techniques for Content Based Document Image Retrieval
Searching documents for information and retrieval of relevant documents is a basic activity. Various tools are readily available for searching and retrieval from digital documents, but not much robust methods are available for retrieval from historic documents and old manuscripts as they are not digitized but available in scanned formats. Conventional way of retrieval from scanned document imag...
متن کاملWord spotting for handwritten documents using Chamfer Distance and Dynamic Time Warping
A large amount of handwritten historical documents are located in libraries around the world. The desire to access, search, and explore these documents paves the way for a new age of knowledge sharing and promotes collaboration and understanding between human societies. Currently, the indexes for these documents are generated manually, which is very tedious and time consuming. Results produced ...
متن کاملEfficient Search in Document Image Collections
This paper presents an efficient indexing and retrieval scheme for searching in document image databases. In many non-European languages, optical character recognizers are not very accurate. Word spotting word image matching may instead be used to retrieve word images in response to a word image query. The approaches used for word spotting so far, dynamic timewarping and/or nearest neighbor sea...
متن کامل